Passaic
Clustering US Counties to Find Patterns Related to the COVID-19 Pandemic
Brown, Cora, Milstein, Sarah, Sun, Tianyi, Zhao, Cooper
When COVID-19 first started spreading and quarantine was implemented, the Society for Industrial and Applied Mathematics (SIAM) Student Chapter at the University of Minnesota-Twin Cities began a collaboration with Ecolab to use our skills as data scientists and mathematicians to extract useful insights from data relating to the pandemic. This collaboration consisted of multiple groups working on different projects. In this write-up we focus on using clustering techniques to find groups of similar counties in the US and on using those groups to help us understand the pandemic. Our team for this project consisted of University of Minnesota students Cora Brown, Sarah Milstein, Tianyi Sun, and Cooper Zhao, with help from Ecolab Data Scientist Jimmy Broomfield and University of Minnesota student Skye Ke. In the sections below we describe all of the work done for this project. In Section 2, we list the data we gathered, as well as the feature engineering we performed. In Section 3, we describe the metrics we used for evaluating our models. In Section 4, we explain the methods we used for interpreting the results of our various clustering approaches. In Section 5, we describe the different clustering methods we implemented. In Section 6, we present the results of our clustering techniques and provide relevant interpretation. Finally, in Section 7, we provide some concluding remarks comparing the different clustering methods.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Michigan > Wayne County > Wayne (0.04)
- North America > United States > Texas > Dallas County > Dallas (0.04)
- Health & Medicine > Epidemiology (0.86)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.63)
- Health & Medicine > Therapeutic Area > Immunology (0.63)
4 questions with Rush CIO Dr. Shafiq Rab
Dr. Shafiq Rab, CIO of Rush University Medical Center in Chicago, uses his background in public health to inform his IT vision. Dr. Rab, who completed his medical degree and internal medicine residency at Karachi, Pakistan-based Dow Medical College, had his interest in public health piqued during one of his first physician jobs. While treating an urban squatters settlement in Pakistan, he worked with non-governmental organizations to address the infant mortality rate, mainly by bringing clean drinking water to its residents. "That's how I got involved in healthcare," he says. "And I remain committed to healthcare.
- North America > United States > Illinois > Cook County > Chicago (0.26)
- Asia > Pakistan > Sindh > Karachi Division > Karachi (0.25)
- North America > United States > New York > Orange County > Middletown (0.05)
Translating Embeddings for Modeling Multi-relational Data
Bordes, Antoine, Usunier, Nicolas, Garcia-Duran, Alberto, Weston, Jason, Yakhnenko, Oksana
We consider the problem of embedding entities and relationships of multi-relational data in low-dimensional vector spaces. Our objective is to propose a canonical model which is easy to train, contains a reduced number of parameters and can scale up to very large databases. Hence, we propose TransE, a method which models relationships by interpreting them as translations operating on the low-dimensional embeddings of the entities. Despite its simplicity, this assumption proves to be powerful since extensive experiments show that TransE significantly outperforms state-of-the-art methods in link prediction on two knowledge bases. Besides, it can be successfully trained on a large scale data set with 1M entities, 25k relationships and more than 17M training samples.
- North America > United States > Pennsylvania > Bucks County (0.14)
- North America > United States > New Jersey > Ocean County (0.14)
- North America > United States > New Jersey > Atlantic County (0.14)
- Media > Film (1.00)
- Leisure & Entertainment > Sports (0.68)
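The translation idea in the TransE abstract can be illustrated with a minimal sketch. The abstract itself gives no formula, so the code assumes the commonly cited form of the model: a triple (head, relation, tail) is scored by the distance between head + relation and tail, and training uses a margin-based ranking loss against a corrupted triple. The function names, the Euclidean norm, the margin of 1.0, and the toy 2-dimensional vectors are all illustrative assumptions, not taken from the listing.

```python
import math


def transe_score(head, relation, tail):
    """Euclidean distance between (head + relation) and tail.

    Lower values mean the triple is considered more plausible.
    """
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def margin_loss(pos_score, neg_score, margin=1.0):
    """Margin-based ranking loss for one true triple and one corrupted triple.

    The loss is zero once the corrupted triple scores at least `margin`
    worse (farther) than the true triple.
    """
    return max(0.0, margin + pos_score - neg_score)


# Toy embeddings with made-up values, for illustration only.
paris = [1.0, 2.0]
capital_of = [0.5, -1.0]
france = [1.5, 1.0]
berlin = [4.0, 4.0]

good = transe_score(paris, capital_of, france)   # translation lands on the tail
bad = transe_score(berlin, capital_of, france)   # corrupted head entity

print(good, bad, margin_loss(good, bad))
```

In this toy setup the true triple has a smaller distance than the corrupted one, which is the property link prediction relies on when ranking candidate entities.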